Skip to content

Prototype model-based fuzzer's replayer which can be used to reproduce uint tests#3631

Closed
dnkolegov-ar wants to merge 24 commits into
denis/simplex-quintfrom
denis/simplex-quint-local-replayer
Closed

Prototype model-based fuzzer's replayer which can be used to reproduce uint tests#3631
dnkolegov-ar wants to merge 24 commits into
denis/simplex-quintfrom
denis/simplex-quint-local-replayer

Conversation

@dnkolegov-ar
Copy link
Copy Markdown
Collaborator

No description provided.

Adds the two missing symbols referenced by the untracked canonical bins
so they build alongside the legacy TraceData pipeline:

- quint_model::validate_and_extract_expected_canonical(&Trace, &str)
  Trace-native counterpart of validate_and_extract_expected. Encodes via
  encoder::encode_from_trace, runs quint, parses ITF back through the
  shared decoder, and converts the resulting ExpectedState into a
  simplex::replay::Snapshot (Participant-keyed, with typed Digest
  payloads) so callers can assign it to Trace::expected directly.

- trace_mutator::canonical::run_canonical — stub. The Trace-native TLC
  feedback loop lands in Stage 5; this just unblocks the
  trace_mutator_canonical binary from building.

Also adds the four canonical binaries (generate_canonical_seeds,
replay_canonical_trace, trace_mutator_canonical,
validate_canonical_trace_corpus) to source control. validate /
replay / generate are fully functional; trace_mutator_canonical
panics with a pointer to Stage 5.
Adds tracing::record with four entry points paralleling
tracing::runtime's run_quint_*_tracing, but using the canonical
Recorder / RecordingReceiver / RecordingSender / RecordingApp wrappers
from simplex::replay. The four fuzz_targets (honest / byzantine /
twins / disrupter) are switched to call the new _recording variants.
The _tracing functions in runtime.rs are left untouched — tlc.rs
still uses run_honest_pipeline. Stage 6 retires runtime.rs entirely.

Output layout divergences from legacy:
- Traces land under artifacts/traces/<target>_canonical/ (single dir
  per target) instead of <target>_<strategy>/; canonical recording
  writes every trace unconditionally, so TraceSelectionStrategy has no
  meaning here. Filename is still sha1(corpus_bytes).
- Topology.timing captures the fuzz-specific timeouts (1s/2s for
  honest/byzantine, 1s/1.5s for twins) instead of Timing::default() so
  replay reproduces the recording conditions.
- Byzantine/twins Snapshot construction keys reporters at Participant
  index faults+slice_idx; the first `faults` participants contribute no
  NodeSnapshot (Byzantine nodes have no observable state by convention).

simplex_ed25519_quint_tlc_honest is unchanged — it uses the legacy
TLC feedback loop that still needs TraceData.
Rewrites fuzz/src/tracing/static_honest.rs to build simplex::replay::Trace
directly instead of legacy TraceData. Public API keeps the same names
(generate_small_honest_traces, write_small_honest_traces) so the binary
generate_small_honest_traces picks up the switch transparently.

Event sequence mirrors record_honest: per view, emit Propose + leader
Construct(Notarize) + 3x Deliver, then follower Construct/Deliver pairs,
then the notarization cert broadcast + 3x Construct(Finalize) + 3x Deliver
+ finalization cert broadcast (or the analogous nullify chain).

Signer counts in the analytic Snapshot are reported as Q=3 rather than
the full signer list size. This matches live replay behaviour (the
engine's batcher assembles its own cert the moment quorum is reached
and reports *that* cert regardless of the injected signer count) and
lets static_trace_replay_matches_expected pass with strict PartialEq.

Keys are sourced from the shared fixture (namespace=consensus_fuzz,
seed=0) so injected signatures verify during replay just like records
from tracing::record.

Output filename convention is `canonical_<sha1>.json`, matching
generate_canonical_seeds.

Four existing static_honest tests pass unchanged.
- itf_to_trace: reads the Quint ITF JSON, rehydrates the ed25519
  fixture via simplex::replay::rehydrate_keys, walks the ITF
  state diffs (decoder::diff_store_vote / diff_store_certificate)
  and synthesizes Event::Propose / Event::Construct / Event::Deliver
  with real signed Vote/Certificate values. Expected state is pulled
  via decoder::extract_expected_state and lifted to Snapshot via the
  newly-pub quint_model::expected_state_to_snapshot. Output is a
  single canonical Trace JSON per input ITF (no more sibling
  _expected.json — the Snapshot is embedded in Trace.expected).

- trace_to_quint: now reads Trace via Trace::from_json and delegates
  to encoder::encode_from_trace(&trace, required_containers) where
  required_containers = max(last_finalized) across expected.nodes
  (fallback 0).

- quint_model: expected_state_to_snapshot lifted from private to
  pub(crate) so itf_to_trace can reuse it.

CLI shape unchanged for both binaries. Timeout events are not yet
synthesized in itf_to_trace (the ITF state diffs don't cleanly expose
per-node timeouts); the encoder tolerates their absence.

Sanity-run end-to-end: quint test produced an ITF, itf_to_trace
emitted valid canonical Trace JSON (n=4 f=0 epoch=3 events=21),
trace_to_quint re-emitted the same trace as a .qnt module.

Legacy wiring that depended on the sibling _expected.json file
(mbt.mk::replay_itf) will need follow-up when Stage 10 updates the
makefiles.
Replaces the Stage-1 stub in trace_mutator::canonical::run_canonical
with a real TLC feedback loop that mirrors the legacy driver's logic
(load seeds, reseed periodically, mutate, submit to tlc-controlled,
dedup via states/traces/state_traces maps, gate on Quint validation,
persist keepers, amplify with mut_per_trace * num_new_states children)
but operates on simplex::replay::Trace throughout.

Wiring differences from legacy:
- Seeds loaded via Trace::from_json; persisted via Trace::to_json.
- TLC actions from tracing::tlc_encoder::encode_from_trace(&trace).
- Quint gate via quint_model::validate_and_extract_expected_canonical,
  whose returned Snapshot is assigned to trace.expected.
- Mutations via trace_mutator::canonical::mutate_once.
- Env namespace: CANONICAL_MUTATION_SEEDS_FOLDER /
  CANONICAL_MUTATED_TRACES_DIR (fall back to MUTATION_SEEDS_FOLDER /
  MUTATED_TRACES_DIR then built-in defaults). Shared knobs (TLC_URL,
  MUTATOR_ITERATIONS, MUTATOR_SEED, MUTATOR_MUT_PER_TRACE,
  MUTATOR_RESEED_FREQ, MUTATOR_SEED_POPULATION_SIZE, MUTATOR_FAULTS)
  reused verbatim. MUTATOR_FAULTS widens to u32 to match
  Topology.faults.
- PRNG seed file at .canonical_tlc_mutator_seed so canonical and
  legacy drivers can share a seed dir without clobbering each other.
- No automatic fixture copying; seed dir is caller-managed (use
  generate_canonical_seeds or copy canonical fixtures manually).

The legacy trace_mutator/mod.rs::run is unchanged. 14 canonical
mutator unit tests continue to pass.
…k to Trace

Demolishes the legacy TraceData pipeline and migrates every remaining
consumer to simplex::replay::Trace.

Part A — TLC-honest migration
  - tracing::record::run_honest_pipeline returns Option<Trace> via
    a shared build_honest_trace helper factored out of
    run_quint_honest_recording.
  - tlc::submit_trace takes &Trace and renders actions via
    tlc_encoder::encode_from_trace.
  - tlc::run_quint_tlc_honest_model calls the new pipeline.
  - TlcMapper + the TraceData/EncoderConfig path in tlc.rs are gone.

Part B — Legacy demolition
  Files deleted:
    - tracing/{data,decoder,runtime,sniffer}.rs
    - tracing/tests/ (incl. byzantine/twins/honest JSON fixtures)
    - bin/{trace_mutator,replay_trace,validate_trace_corpus,
           convert_trace}.rs
    - replayer/ (whole directory: mod, compare, automaton, injected,
      messages)

  Files pruned:
    - tracing/encoder.rs — removed TraceData path (encode, build_
      action_items, build_block_map, VoteReplayKey, ...). Kept
      encode_from_trace, build_block_map_from_events,
      build_proposals_from_events, lower_events_to_actions,
      cert_to_item_canonical, snapshot_to_reporter_states,
      render_quint_*. Replaced the external ReporterReplicaStateData /
      TraceProposalData types (formerly in tracing::data) with
      internal ReporterStateData / ProposalData. Canonical tests kept.
    - tracing/tlc_encoder.rs — dropped encode(&TraceData) and
      TlcMapper; only encode_from_trace remains.
    - trace_mutator/mod.rs — shrunk from ~1745 LOC to ~40 LOC
      (re-exports canonical and find_json_files).
    - quint_model.rs — pruned TraceData entry points. ExpectedState
      / ExpectedNodeState (formerly in replayer/compare.rs) moved in.
      identify_correct_nodes / extract_expected_state (formerly in
      tracing::decoder) moved in. Canonical
      validate_and_extract_expected_canonical keeps its own inline
      ITF parser.
    - lib.rs — removed run_quint_*_tracing wrappers; dropped
      pub mod replayer. Kept pub mod apalache/ist and the
      _recording wrappers.
    - Cargo.toml — removed [[bin]] entries for trace_mutator,
      replay_trace, validate_trace_corpus, convert_trace.

Part C — apalache/ist/tlc_check/itf_to_trace migrated to Trace
  - apalache.rs — unchanged (pure JSON-RPC client, no TraceData
    dependency).
  - ist.rs — full migration. Replacement ExpectedState /
    extract_expected_state / identify_correct_nodes come from
    quint_model; injected channels (NullBlocker, NullSender,
    PendingReceiver) pulled from simplex::replay::injected;
    Mismatch/compare + message construction re-inlined.
  - bin/ist.rs — unchanged (IstConfig + IstReport signatures
    preserved).
  - bin/tlc_check.rs — loads Trace::from_json and calls
    tlc_encoder::encode_from_trace(&Trace) directly.
  - bin/itf_to_trace.rs — imports updated to pull ITF helpers from
    quint_model.

  Cargo.toml: [[bin]] entries for ist, tlc_check, itf_to_trace
  restored.

Tests: 24 lib tests pass. cargo build --all-targets clean.
Now that TraceData is gone there's no reason to keep the "_canonical"
suffix. Four binary files and a handful of symbols + env vars are
renamed back to the plain names the pre-refactor legacy pipeline used.

Binary renames (git mv, so history follows):
  trace_mutator_canonical.rs      -> trace_mutator.rs
  replay_canonical_trace.rs       -> replay_trace.rs
  validate_canonical_trace_corpus -> validate_trace_corpus.rs
  generate_canonical_seeds.rs     -> generate_seeds.rs

Cargo.toml: [[bin]] entries for the four new plain names added.

Symbol renames:
  trace_mutator::canonical::run_canonical              -> ::run
  quint_model::validate_and_extract_expected_canonical -> ::validate_and_extract_expected
  Several private helpers inside trace_mutator::canonical lose their
  _canonical suffix (seed_dir, output_dir, load_seeds, ...).

Env vars:
  CANONICAL_MUTATION_SEEDS_FOLDER dropped; MUTATION_SEEDS_FOLDER
  is now the only name. Default changed:
    artifacts/canonical_tlc_mutator -> artifacts/tlc_mutator
  CANONICAL_MUTATED_TRACES_DIR dropped; MUTATED_TRACES_DIR is now
  the only name. Default changed:
    artifacts/canonical_mutated_traces -> artifacts/mutated_traces
  Persisted PRNG seed filename:
    .canonical_tlc_mutator_seed -> .tlc_mutator_seed

The `pub mod canonical;` module name under trace_mutator is kept
as-is (renaming the module would require moving file contents into
mod.rs, which is more invasive than the benefit justifies for now).

Doc comments referencing the legacy TraceData mutator are removed
from trace_mutator/{canonical,mod}.rs headers.

Build clean; 24 lib tests pass.
Adds 5 canonical Trace JSON fixtures under
fuzz/src/tracing/tests/fixtures/ and wires them into the
trace_mutator seed-dir bootstrap so a fresh MUTATION_SEEDS_FOLDER
populates itself on first run.

Fixtures (all round-trip cleanly through simplex::replay::replay):
  honest/honest_<sha1>.json     (x2, namespaces fixture_honest_{0,1})
  byzantine/byzantine_<sha1>.json (x2, namespaces fixture_byzantine_{0,1})
  twins/twins_placeholder_<sha1>.json (x1, namespace fixture_twins_0)

Byzantine + twins fixtures use record_honest with
trace.topology.faults = 1 set post-hoc and trace.expected
regenerated via replay(&trace); the replayer skips events
targeting Byzantine indices so the embedded snapshot reflects
the 3 honest nodes correctly. Twins placeholder is structurally
valid for the mbf pipeline's dynamic-faults model.

trace_mutator/canonical.rs::run now calls bootstrap_fixtures
before load_seeds. bootstrap_fixtures:
  - walks fixtures/{honest,byzantine,twins}/
  - filters by MUTATOR_FAULTS if set (double-checks against
    each fixture JSON's topology.faults)
  - copies sha1-named files into the seed dir idempotently
  - warns + continues on copy errors

Adds:
  - #[test] #[ignore] write_fixture_set — source of truth for
    fixture regeneration (clears subdirs, rebuilds via
    record_honest, asserts replay round-trip)
  - #[test] bootstrap_fixtures_copies_matching_and_skips_others
    — verifies faults filter + idempotency

25 passing tests (1 ignored).
After Stages 1-8 the fuzz crate only produces / consumes canonical
Trace JSON. The makefiles are updated to match:

Top-level Makefile:
- MUTATION_SEEDS_FOLDER default -> ../fuzz/artifacts/tlc_mutator
  (matches Stage 7's canonical rename).
- Dropped TRACE_TARGET / TRACE_TARGETS / TRACE_SELECTION_STRATEGY /
  MIN_REQUIRED_CONTAINERS / MAX_REQUIRED_CONTAINERS (no Rust consumer).

tv.mk:
- TRACE_TARGET / TRACE_TARGETS defaults now live here, space-separated.
- tv_live_fuzz: iterates TRACE_TARGETS as a plain list; dropped the
  [a,b,c] parser.
- Dropped TRACE_SELECTION_STRATEGY / MIN_ / MAX_REQUIRED_CONTAINERS
  env exports from all targets.
- Fixed tv_live_watch pointing at undefined TEST_DIR; now uses
  TRACES_DIR.

mbt.mk:
- replay_itf: single-arg replay_trace; dropped sibling _expected.json
  handling (Snapshot is now embedded in the Trace).

mbf.mk:
- mutate_traces: dropped MUTATOR_DEBUG and redundant path export.
- replay_mutated_traces, mbf_live_watch: dropped REPLAY_FAULTS and
  *_expected.json skip (replayer reads faults from the Trace; no
  sibling files exist).
- mbf_live_trace_fuzz_gen, mbf_augment_seeds_from_fuzz: src path
  switched from <target>_$(TRACE_SELECTION_STRATEGY) to
  <target>_canonical (matches Stage 2's recorder output layout).

make -n spot-checks pass for the full set of flow entry points
(get_fuzz_traces, build_quint_tests_from_fuzz_traces, tv, tv_live,
replay_itf, mutate_traces, mbf_live*, mbf_augment_seeds_from_fuzz,
conformance_roundtrip).

Known pre-existing flags not addressed: mbt.mk::generate_itf_test
overwrites a single trace file across iterations; scripts/
test_traces.sh + watch_new_traces.sh still honour TRACE_SELECTION_
STRATEGY for their own sample-count gating (harmless; defaults to
"current").
         Byzantine filtering, and local-store semantics

Three distinct bugs surfaced while fuzzing the tv flow. Ship all
three together since each one individually only advances the trace
to the next mismatch.

1) build_leader_map_to now reads leaders from Event::Propose in the
   trace and only falls back to (epoch + view) % n for views that
   have no Propose. Fixes QNT513 on twins traces (and any future
   non-identity-permutation elector, e.g. RoundRobin::shuffled or
   Random/VRF): the encoder previously assumed the permutation was
   identity, which is only true for the default RoundRobin.

2) lower_events_to_actions takes faults: u32 and applies two
   mandatory filters for traces with faults>0:
   - Deliver with to.get() < faults: skip OnNotarize/OnNullify/
     OnFinalize/OnCertificate (Byzantine receivers aren't modeled
     in Quint; replica_state.get("n0") panics). SendCertificate
     still fires before skipping the OnCertificate since it
     represents network availability, not receiver state.
   - Propose with leader.get() < faults: drop entirely. The
     Byzantine leader's notarize vote is already introduced by the
     following Event::Construct, which lowers to a
     SendNotarizeVote barrier that carries the proposal.

3) The generated send_notarize_vote / send_nullify_vote /
   send_finalize_vote helpers in the .qnt now mirror the voter's
   local store when vote.sig is correct:
       store_*_votes' = if (CORRECT.contains(vote.sig))
           store_*_votes.set(vote.sig,
               store_*_votes.get(vote.sig).union(Set(vote)))
           else store_*_votes
   Before this, send_nullify_vote("n1") only updated the network's
   sent_nullify_votes. When n1 then received n2 + n3 nullifies it
   saw only {n2,n3} in its local store, could not form a quorum,
   and could not enter view v+1. This matches Rust replay where
   Event::Construct for a correct node updates that node's state.

Regression coverage added under tracing::encoder::canonical_tests:
- byz_receiver_deliveries_are_filtered_from_quint
- byz_leader_propose_is_filtered_from_quint
- byz_receiver_deliveries_are_filtered_from_tlc
- send_notarize_vote_helper_updates_signer_local_store
- send_nullify_vote_helper_updates_signer_local_store
- send_finalize_vote_helper_updates_signer_local_store

31 lib tests pass (was 25; +6 new).
The ITF extraction still read the legacy monolithic names
`store_certificate` / `store_vote`, which have been split in the
current replica.qnt into:
  - store_certificates
  - store_notarize_votes / store_nullify_votes / store_finalize_votes

Consequence: validate_trace_corpus accepted traces whose embedded
expected Snapshot had empty notarizations / finalizations / certified
and signer sets containing only each node's own vote (the sole
surviving data came from `sent_*_votes` extraction, which already
handled the split). Replay then mismatched against a wrong snapshot
that didn't reflect delivered peer votes or quorum-formed certs.

collect_store_certificate now reads store_certificates first and
falls back to store_certificate. collect_store_vote reads the three
split maps and tags each vote with its kind ("Notarize" | "Nullify"
| "Finalize") so `insert_vote_signer` sees the same shape
`sent_*_votes` already produced. Legacy `store_vote` support retained
as fallback for old fixtures.

Verified with the user-reported failing trace
(static_generated_ea97243a...): revalidated via
validate_trace_corpus, embedded Snapshot now contains full observed
signer sets plus the formed notarization/finalization certs; replay
goes from MISMATCH to MATCH.
…e_trace_corpus

## Fuzz-recorded traces were not replayable

fuzz/src/tracing/record.rs drove the deterministic runtime with the
FuzzRng seeded from the input bytes, and that context's RNG was then
consumed by P::fixture to derive the ed25519 keyset. On the replay
side, simplex::replay::trace::rehydrate_keys always uses
Runner::seeded(0). The two RNG streams diverge, so fuzz-canonical
traces embedded signed payloads whose verifier could not be
reconstructed by replay. The result was replayable-looking JSON with
silently-broken signatures and Snapshot mismatches that only showed
up when traces passed through Quint + replay.

## Fix

- lib.rs: split setup_network into setup_network_with_fixture
  (takes a pre-built Fixture) + a thin setup_network wrapper that
  preserves the original FuzzRng-derived fixture for non-recording
  paths (pure fuzz invariants).
- tracing/record.rs: new `replay_fixture(n, namespace)` helper that
  mirrors `rehydrate_keys` exactly — nested Runner::seeded(0) +
  ed25519::fixture over the same (namespace, n). All three canonical
  recording entry points now pre-derive the fixture this way and
  call setup_network_with_fixture, so the keys written into each
  Trace match what replay will regenerate.
- New unit regression tracing::record::tests::honest_recording_
  roundtrips_through_rust_replay: runs run_honest_pipeline, replays,
  asserts replay(&trace) == trace.expected.

## Rust replay gate in validate_trace_corpus

Even with the fixture fix, any later divergence between Quint's
extracted Snapshot and Rust's replay Snapshot would silently slip
into the mbf seed pool. validate_trace_corpus now runs replay(&trace)
after Quint validation and rejects candidates whose expected
snapshot != replay snapshot. Can be bypassed with SKIP_REPLAY_GATE=1
for debugging.

Stale fuzz-canonical trace directories under
fuzz/artifacts/traces/simplex_ed25519_quint_*_canonical/ and
fuzz/artifacts/mutated_traces are untrusted after this commit and
have been cleaned locally; they regenerate cleanly on the next fuzz
run now that keys line up.
- Byzantine recording (run_quint_byzantine_recording) and twins
  recording (run_quint_twins_recording) still called the
  FuzzRng-derived setup_network. Under MBF_FAULTS>0 or for the
  disrupter fuzz target, traces remained non-replayable in the same
  way honest ones were before the previous commit. Both paths now
  use replay_fixture + setup_network_with_fixture, matching the
  honest path.
- Tighten setup_network_with_fixture visibility to pub(crate); it's
  only used inside the fuzz crate and its return type mentions the
  private NetworkChannels alias.
- Replace em dash with a comma in tracing/record.rs:51 per the repo
  style guide.
The previous -Xmx4g default OOM'd the JVM mid-/execute for the simplex
state space. The crash happened silently because tlc.sh runs with
stdio piped out by the mbf_live_fuzz wrapper, so the Rust client saw
only a transport-level "error sending request for url" on the first
/execute call (JVM dead before responding).

Raise the default to 16g and expose JVM_MAX_HEAP / JVM_STACK_SIZE
env vars so host machines with different RAM profiles can tune without
editing the script. Usage block documents both.
The controlled TLC server's simulate(..., is_reset=true) loop only
stops on a reset/quit/unknown terminator. Without one it calls
actionsToRun.remove() on an empty queue, throws
NoSuchElementException, then NPEs in the catch block on
e.getMessage().length() because NoSuchElementException.getMessage()
is null. Jetty drops the connection, and the Rust reqwest client
sees only a transport-level "error sending request" with no server
status or body.

tlc::submit_trace already did the right thing. The Stage-5 port of
trace_mutator::canonical::run missed it, and bin/tlc_check.rs
didn't have it either. Every path is now routed through a shared
tlc::terminate_actions(&mut Vec<Value>) helper so future
/execute-posting code picks it up by default.

32 lib tests pass.

(Out of scope: hardening the Java handler to use String.valueOf(
e.getMessage()) so future TLC-side errors surface cleanly instead of
silently closing the connection.)
…ding

Walks through the three top-level fields (topology, events, expected),
the four Event variants (Deliver/Construct/Propose/Timeout), how
events get produced by the recorder, and the three-stage Quint
lowering pipeline (block map, ActionItem lowering with Byzantine
filtering, .qnt rendering with leader-map from events and
self-store-mirroring send_*_vote helpers). Ends with a worked example
of one view of an honest 4-node trace.
## Selection strategies

Canonical recorder was admitting every recorded trace, blowing up
fuzz/mbf seed pools with near-duplicates. Port the pre-Lane-B
strategy machinery (SmallScope, LoF, Current) to operate on canonical
Events in a new tracing::selection module:

- TraceMetrics::from_trace walks Event::{Construct, Deliver, Propose,
  Timeout} and builds per-view signer vectors [notarize | nullify |
  finalize votes by each correct node | notarization |
  nullification | finalization cert senders]. Votes are counted via
  Construct (one broadcast per unique vote); certs via Deliver.
- SmallScope: admits a candidate iff its per-view signature is
  novel AND its signature length is >= the longest seen. Default.
- LoF: admits candidates whose k-NN local outlier factor exceeds
  1.5 over the 5 nearest seen signatures.
- Current: admits anything with >=1 cert and >1 unique payload.
- Default is smallscope (honors TRACE_SELECTION_STRATEGY env var).
- Disrupter traces pass filter_n0=true so n0's certificate
  broadcasts aren't counted as novelty (n0 is the disrupter).
- Output dir becomes artifacts/traces/<target>_<strategy>/ (was
  _canonical literal).

record.rs::persist_trace now delegates to
selection::select_and_persist, which also streams a per-strategy
fuzz.log with each selected trace's full metrics + view signatures.

## Mutator driver refill

Separately: the Stage-5 trace_mutator::canonical::run drained its
candidate queue into a fatal exit when all fallback single-shot
mutations returned None. That's misplaced: mutate_once legitimately
returns None when a random mutation violates
preserves_first_broadcast_order (~30-40% per-attempt success rate),
and the fallback only tried base_seeds.len() attempts (2 in the
honest case).

Two fixes:
- mutate_once now retries up to 32 random mutations internally
  before returning None. One unlucky pick no longer loses the
  whole attempt.
- Driver queue-empty now refills via seed_queue_generated (full
  population) before any fallback. If the refill still produces
  nothing, it does seed_population_size.max(64) single-shot
  attempts before exiting.

Makefile:
- Top-level Makefile declares + exports TRACE_SELECTION_STRATEGY
  (default smallscope).
- mbf.mk rewrites all <target>_canonical hardcodes to
  <target>_$(TRACE_SELECTION_STRATEGY) so artifact paths track the
  recorder's strategy.
- Add .mbf_model_tmp/ (ITF validation scratch dir) to
  consensus/quint/.gitignore.

32 lib tests pass; no regressions.
Replace the process::exit(1) fallback in the queue-drain path with a
three-tier recovery:
  1. seed_queue_generated refill (population-sized batch).
  2. exhaustive_candidate_lookup: shuffled per-base-seed sweep with
     EXHAUSTIVE_PER_SEED_BUDGET (128) independent mutate_once attempts
     per seed. Short-circuits on first success. Fair across seeds, so
     an unlucky RNG can't starve live seeds while one specific seed
     dominates the retry loop.
  3. Only if every seed has consumed its full budget: log a single
     diagnostic line and `break` out of the iteration loop. The
     driver still runs its final summary (kept / uninteresting /
     empty_actions / errors / states / traces). mbf_live / mbf_live_
     watch processes are not torn down.

Rationale: mutate_once legitimately fails ~30-40% of attempts when
the chosen random mutation violates preserves_first_broadcast_order.
With base_seeds.len() attempts a pathological run can randomly hit a
long streak of failures even when every seed has plenty of
admissible mutations. Giving each seed its own bounded budget
distinguishes "probabilistic tail" from "this corpus can't be
mutated at all".

15 mutator tests pass.
## 10 new mutations ported from the legacy TraceData mutator

Operate exclusively on Event::Deliver entries — Propose / Construct /
Timeout are skeleton and stay put; no wire content is modified.

Added variants:
  - SwapByRecipient, BatchSplit, FanoutSkew, ChannelSkew, BurstRelease,
    InterleaveReceivers, TimeoutEdge, RelayPreference  (honest-safe)
  - Swap, Duplicate                                   (Byzantine-only)

mutation_candidates(&trace) returns HONEST_MUTATIONS for faults == 0
and ALL_MUTATIONS otherwise. apply_mutation() now takes the candidate
set as a parameter; mutate_once() plumbs it through. The
HONEST_MUTATIONS doc comment is explicit that membership does NOT
imply invariant-safety by construction — several members can violate
preserves_first_broadcast_order on some inputs and are made safe only
by mutate_once's retry gate.

## Determinism fix for MUTATOR_SEED reproducibility

group_deliver_indices_by now returns BTreeMap<K, Vec<usize>>; Channel
derives Ord. delay_link / delay_sender / delay_recipient also
switched from HashMap to BTreeMap. Iterating these maps and picking
via RNG previously went through Rust's per-process-randomized
HashMap order, so the same MUTATOR_SEED could produce different
traces across runs.

## Graceful queue exhaustion + fair per-seed budget

When the driver's queue drains, we first refill via
seed_queue_generated, then fall back to exhaustive_candidate_lookup:
shuffled per-base-seed sweep with 128 mutate_once attempts each.
process::exit(1) is replaced with a clean break so mbf_live /
mbf_live_watch keep running.

## Module rename

canonical.rs is folded into trace_mutator/mod.rs (the old thin mod.rs
is gone). Consumers can now `use trace_mutator::{run, mutate_once,
Mutation, ...}` directly. find_json_files, which previously existed
in both files, is kept at the module root.

## Tests

29 tests (up from 18):
  - 10 per-mutation regressions (Swap, Duplicate, SwapByRecipient,
    BatchSplit, FanoutSkew, ChannelSkew, BurstRelease,
    InterleaveReceivers, TimeoutEdge, RelayPreference) asserting
    non-Deliver skeleton is preserved and (for honest-safe mutations
    with repeatable-broadcast inputs) first-broadcast-order holds.
  - delay_sender_is_reproducible_under_fixed_rng: the determinism
    regression covering the BTreeMap change — identical RNG seed
    produces byte-identical mutation output across runs.
  - queue_exhaustion_returns_none_not_process_exit: regression
    covering graceful recovery from a degenerate seed corpus.

## Style

Em dashes in comments replaced with hyphens per repo guidelines.
@cloudflare-workers-and-pages
Copy link
Copy Markdown

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
✅ Deployment successful!
View logs
commonware-mcp 70792d1 Apr 17 2026, 10:54 PM

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 70792d1. Configure here.

&mut block_hex_to_digest,
);
propose_events.extend(events.drain(..));
events = propose_events;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All Propose events placed before Deliver/Construct events breaking ordering

High Severity

The propose_events vector (all Propose events sorted by view) has ALL Construct/Deliver events appended after it via extend(events.drain(..)). This means every Propose event for every view is grouped at the start, followed by all Construct/Deliver events. The canonical trace format requires causal ordering where Propose(view=V) precedes the leader's Construct(Notarize, view=V) and subsequent Deliver events for that view. The replay driver processes events sequentially, so placing Propose(view=3) before view 1's Construct/Deliver events breaks the expected interleaving and will cause replay failures.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 70792d1. Configure here.

Comment thread consensus/fuzz/src/ist.rs
}
}
new_entries
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Massive ITF helper code duplicated across three files

Medium Severity

About 400+ lines of ITF decode helpers (get_var, parse_int, parse_set, parse_map, block_to_hex, extract_leader_map, compute_epoch, count_nodes, collect_store_vote, collect_store_certificate, diff_store_vote, diff_store_certificate, parse_itf_vote, parse_itf_cert, and associated types like TracedVote, TracedCert, TraceEntry) are copy-pasted identically into both ist.rs and itf_to_trace.rs, while equivalent implementations already exist in quint_model.rs. This triples the maintenance surface for any fix or change to ITF parsing logic.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 70792d1. Configure here.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.63%. Comparing base (3fa68d6) to head (70792d1).
⚠️ Report is 101 commits behind head on denis/simplex-quint.

@@                   Coverage Diff                   @@
##           denis/simplex-quint    #3631      +/-   ##
=======================================================
- Coverage                95.80%   95.63%   -0.18%     
=======================================================
  Files                      439      431       -8     
  Lines                   165224   152636   -12588     
  Branches                  3859     3636     -223     
=======================================================
- Hits                    158296   145973   -12323     
+ Misses                    5718     5508     -210     
+ Partials                  1210     1155      -55     
Files with missing lines Coverage Δ
consensus/src/simplex/actors/voter/actor.rs 96.76% <ø> (-0.84%) ⬇️
consensus/src/simplex/actors/voter/ingress.rs 81.48% <ø> (-18.52%) ⬇️
consensus/src/simplex/actors/voter/mod.rs 91.62% <ø> (+<0.01%) ⬆️
consensus/src/simplex/actors/voter/round.rs 99.35% <ø> (-0.26%) ⬇️
consensus/src/simplex/actors/voter/state.rs 98.57% <ø> (-0.22%) ⬇️
consensus/src/simplex/engine.rs 97.60% <ø> (+0.05%) ⬆️
consensus/src/simplex/metrics.rs 100.00% <ø> (ø)
consensus/src/simplex/mod.rs 98.50% <ø> (+0.14%) ⬆️
consensus/src/simplex/replay/automaton.rs 95.08% <ø> (ø)
consensus/src/simplex/replay/driver.rs 97.26% <ø> (ø)
... and 4 more

... and 193 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3fa68d6...70792d1. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant